Imputation for Repeated Bounded Outcome Data: Statistical and Machine-Learning Approaches

نویسندگان

چکیده

Real-life data are bounded and heavy-tailed variables. Zero-one-inflated beta (ZOIB) regression is used for modelling them. There no appropriate methods to address the problem of missing in repeated outcomes. We developed an imputation method using ZOIB (i-ZOIB) compared its performance with those naïve machine-learning methods, different distribution shapes settings designed simulation study. The was measured employing absolute error (MAE), root-mean-square-error (RMSE) unscaled mean relative (UMBRAE) methods. results varied depending on missingness rate mechanism. i-ZOIB ANN, SVR RF showed best performance.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Accuracy evaluation of different statistical and geostatistical censored data imputation approaches (Case study: Sari Gunay gold deposit)

Most of the geochemical datasets include missing data with different portions and this may cause a significant problem in geostatistical modeling or multivariate analysis of the data. Therefore, it is common to impute the missing data in most of geochemical studies. In this study, three approaches called half detection (HD), multiple imputation (MI), and the cosimulation based on Markov model 2...

متن کامل

Machine Learning Approaches for Dealing with Limited Bilingual Data in Statistical Machine Translation

Statistical machine translation (SMT) systems have made great strides in translation quality. However, high quality translation output is dependent on the availability of massive amounts of parallel text in the source and target language. There are a large number of languages that are considered “low-density”, either because the population speaking the language is not very large, or even if mil...

متن کامل

Missing data imputation using statistical and machine learning methods in a real breast cancer problem

OBJECTIVES Missing data imputation is an important task in cases where it is crucial to use all available data and not discard records with missing values. This work evaluates the performance of several statistical and machine learning imputation methods that were used to predict recurrence in patients in an extensive real breast cancer data set. MATERIALS AND METHODS Imputation methods based...

متن کامل

Machine Learning Approaches for Dealing with Limited Bilingual Training Data in Statistical Machine Translation

Statistical Machine Translation (SMT) models learn how to translate by examining a bilingual parallel corpus containing sentences aligned with their human-produced translations. However, high quality translation output is dependent on the availability of massive amounts of parallel text in the source and target languages. There are a large number of languages that are considered low-density, ei...

متن کامل

Statistical Machine Learning from Data

NOTE: A good introduction to various machine learning models. NOTE: The theory is explained here with all the equations. [4] Vladimir N. Vapnik. The nature of statistical learning theory. Springer, second edition, 1995. NOTE: A good introduction to the theory, not much equations. NOTE: Very good paper proposing a series of tricks to make neural networks really working.

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Mathematics

سال: 2021

ISSN: ['2227-7390']

DOI: https://doi.org/10.3390/math9172081